
The Journal of the Acoustical Society of America

Acoustical Society of America (ASA)

Preprints posted in the last 90 days, ranked by how well they match the content profile of The Journal of the Acoustical Society of America, built from 33 papers previously published in this journal. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.

1
Orca vowels and consonants: convergent spectral structures across cetacean and human speech

Begus, G.; Holt, M.; Wright, B.; Gruber, D. F.

2026-03-02 · animal behavior and cognition · 10.64898/2026.02.27.708287 · medRxiv
Top 0.1% · 18.8% match score

The vocal communication system of orcas (Orcinus orca) has so far been analyzed primarily in terms of fundamental frequency (F0) modulations, i.e., the vibration frequency of the phonic lips. The calls have been divided into clicks, pulsed calls, whistles, and types thereof. By analyzing 61 hours of on-orca acoustic recordings and controlling for the effect of high-frequency components (HFC) and F0, we report structured formant patterns in orca vocalizations, including diphthongal trajectories. Broadband spectrogram analysis reveals previously unreported formant patterns that appear independent of F0 and HFC and are hypothesized to result from air sac resonances. This study builds on the recent report of formant structure in vowel- and diphthong-like calls in another cetacean, the sperm whale (Physeter macrocephalus). Using linguistic techniques, we further demonstrate that some calls are reminiscent of human consonant-vowel sequences, featuring bursts or abrupt decreases in amplitude. We also show that individual sparsely distributed clicks gradually transition into high-frequency tonal calls, which aligns with analyses of sperm whale codas as vocalic pulses. The paper makes methodological contributions to cetacean communication research by analyzing orca vocalizations with both narrowband and broadband spectrograms. The reported patterns are hypothesized to be actively controlled by the whales and may carry communicative information. The spectral patterns shown in this study provide an added dimension to the orca communication system that merits further analysis, and they demonstrate convergent evolution of similar phonological features in cetacean (orca and sperm whale) and human communication systems.
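
As an illustration of the narrowband-versus-broadband point, here is a minimal sketch with entirely made-up parameters (not the authors' pipeline): a synthetic vowel-like signal built from an impulse train at a hypothetical F0, filtered by two invented resonances standing in for formants. A long analysis window resolves the individual F0 harmonics (narrowband view); a short window smears them and exposes the formant envelope (broadband view).

```python
import numpy as np
from scipy import signal

fs = 44100
t = np.arange(0, 1.0, 1 / fs)
f0 = 300.0                              # hypothetical source F0
src = np.zeros(t.size)
src[:: int(fs / f0)] = 1.0              # impulse train: crude phonic-lip/glottal stand-in
x = src
for fc in (900.0, 2300.0):              # invented resonance ("formant") frequencies
    b, a = signal.iirpeak(fc, Q=8, fs=fs)
    x = signal.lfilter(b, a, x)

# Narrowband: ~46 ms window resolves F0 harmonics as horizontal striations.
f_nb, t_nb, S_nb = signal.spectrogram(x, fs, nperseg=2048, noverlap=1536)
# Broadband: ~3 ms window smears harmonics, leaving the formant envelope.
f_bb, t_bb, S_bb = signal.spectrogram(x, fs, nperseg=128, noverlap=96)
print(S_nb.shape, S_bb.shape)
```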

2
Testing differential effects of periodicity and predictability in auditory rhythmic cueing of concurrent speech

MacLean, J.; Zhou, M.; Bidelman, G.

2026-03-13 · neuroscience · 10.64898/2026.03.11.711109 · medRxiv
Top 0.1% · 18.5% match score

Entrainment and predictive coding aid speech perception in both quiet and noisy environments. Isochronous, periodic auditory rhythmic cues facilitate entrainment and temporal expectations, which can benefit encoding and perception of target speech. However, most studies using isochronous cues confound periodicity with predictability. To disentangle these factors, we characterized how systematic changes in the acoustic dimensions of stimulus rate, target phase, periodicity, and predictability of an entraining sound precursor impact the subsequent identification of concurrent speech targets. Target concurrent vowel pairs were preceded by rhythmic woodblock cues which were either periodic-predictable (PP, isochronous rhythm), aperiodic-predictable (AP, accelerating rhythm), or aperiodic-unpredictable (AU, random rhythm). The number of pulses per rhythm was roved to further manipulate predictability. Stimuli also varied in presentation rate (2.5, 4.5, 6.5 Hz) and target speech phase (in-phase, 0°; out-of-phase, 90°, 180°) relative to the preceding entraining rhythm. We also measured participants' musical pulse continuation and standardized speech-in-noise perception abilities. We did not observe any effects of stimulus rhythm, rate, or target phase on target speech identification accuracy. However, reaction times were slowest at the nominal speech rate (4.5 Hz) and were most disrupted by out-of-phase presentations following the PP rhythm. Double-vowel task performance was associated with stronger musical pulse continuation abilities, but not with speech-in-noise perception. Our results support the notion that the effects of entraining rhythmic cues rely on top-down processing but are relatively muted when stimulus predictability is unknown. Additionally, we find that individual differences in musical pulse perception may underlie the benefits of rhythmic cueing on subsequent speech perception.
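
A hedged sketch of the three precursor rhythms as onset-time generators, plus the phase-shifted target placement. The acceleration and jitter rules below are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

def cue_onsets(rate_hz, n_pulses, kind):
    base = 1.0 / rate_hz                                # mean inter-onset interval (IOI)
    if kind == "PP":                                    # periodic-predictable: isochronous
        iois = np.full(n_pulses - 1, base)
    elif kind == "AP":                                  # aperiodic-predictable: assumed steady acceleration
        iois = base * np.linspace(1.3, 0.7, n_pulses - 1)
    else:                                               # "AU" aperiodic-unpredictable: random IOIs, same mean
        iois = base * rng.uniform(0.5, 1.5, n_pulses - 1)
    return np.concatenate(([0.0], np.cumsum(iois)))

for rate in (2.5, 4.5, 6.5):                            # presentation rates used in the study
    onsets = cue_onsets(rate, 5, "PP")
    for phase_deg in (0, 90, 180):                      # target phase relative to the rhythm
        target = onsets[-1] + (1 + phase_deg / 360) / rate
        print(f"{rate} Hz, {phase_deg} deg -> target at {target:.3f} s")
```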

3
Modeling the Influence of Bandwidth and Envelope on Categorical Loudness Scaling

Neely, S. T.; Harris, S. E.; Hajicek, J. J.; Petersen, E. A.; Shen, Y.

2026-04-01 · neuroscience · 10.64898/2026.03.30.715393 · medRxiv
Top 0.1% · 14.2% match score

In a loudness-matching paradigm, a reduction in the loudness of sounds with bandwidths of less than one-half octave, compared to a tone of equal sound pressure level, has previously been observed for five-tone complexes at 60 dB SPL centered at 1 kHz. Here, this loudness-reduction phenomenon is explored using band-limited noise across wide ranges of frequency and level. Additionally, these measurements are simulated by a model of loudness judgement based on neural ensemble averaging (NEA), which serves as a proxy for central auditory signal processing. Multi-frequency equal-loudness contours (ELCs) were measured for each of the adult participants (N=100), whose pure-tone average (PTA) thresholds ranged from normal to moderate hearing loss, using a categorical-loudness-scaling (CLS) paradigm. Presentation level and center frequency of the test stimuli were determined on each trial according to a Bayesian adaptive algorithm, which enabled multi-frequency ELC estimation within about five minutes of testing. Three separate test conditions differed by stimulus type: (1) pure tone, (2) quarter-octave noise, and (3) octave noise. For comparison, loudness judgements for all three stimulus types were also simulated by the NEA model, which comprised a nonlinear, active, time-domain cochlear model with an appended stage of neural spike generation. Mid-bandwidth loudness reduction was observed to be greatest at moderate stimulus levels and frequencies near 1 kHz. This feature was approximated by the NEA model, which suggests involvement of an early stage of the central auditory system in the formation of loudness judgements.
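
For concreteness, a minimal sketch of the three stimulus types, assuming Butterworth band edges at fc·2^(±bw/2) and equal-RMS normalization; the study's actual stimulus generation is not specified in the abstract.

```python
import numpy as np
from scipy import signal

fs, dur, fc = 48000, 0.5, 1000.0
t = np.arange(0, dur, 1 / fs)

def band_noise(fc, octaves, fs, n, seed=1):
    lo, hi = fc * 2 ** (-octaves / 2), fc * 2 ** (octaves / 2)     # band edges
    sos = signal.butter(8, [lo, hi], btype="bandpass", fs=fs, output="sos")
    x = signal.sosfilt(sos, np.random.default_rng(seed).standard_normal(n))
    return x / np.sqrt(np.mean(x ** 2))                            # unit RMS, i.e. equal SPL

tone = np.sqrt(2) * np.sin(2 * np.pi * fc * t)                     # unit-RMS pure tone
quarter_octave = band_noise(fc, 0.25, fs, t.size)
octave = band_noise(fc, 1.0, fs, t.size)
```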

4
Impacts of heminode disruption on auditory processing of noisy sound stimuli

Tripathy, S.; Budak, M.; Maddox, R.; Mehta, A. H.; Roberts, M. T.; Corfas, G.; Booth, V.; Zochowski, M.

2026-02-04 · neuroscience · 10.64898/2026.02.02.703242 · medRxiv
Top 0.1% · 10.2% match score

Hidden hearing loss (HHL) is an auditory neuropathy characterized by altered auditory nerve responses despite normal hearing thresholds. Recent experimental and computational studies suggest that permanent disruptions to heminode positions in spiral ganglion neuron (SGN) fibers can contribute to these deficits. However, the interaction between heminode disruption and noisy backgrounds ubiquitous in daily listening remains unexplored. This study investigates how background noise affects auditory processing with these peripheral disorders and how deficits propagate to downstream sound localization circuits in the superior olivary complex. We developed computational models of SGN fibers with mild and severe degrees of heminode disruption, subjected to sinusoidal tone stimuli in the presence of background noise with varying spectral characteristics. We analyzed the phase-locking of SGN fiber responses to the stimulus tone and modeled the subsequent effects on interaural time difference (ITD) sensitivity in the medial superior olive (MSO) using a binaural localization network. We found that near-tone-frequency noise disrupted SGN phase locking through cycle-to-cycle variability in spike phases, with effects consistent across tone frequencies. Mild heminode disruption produced frequency-dependent degradation in SGN phase locking, with effects observed only at higher frequencies tested (600-1000 Hz), without reducing overall firing rates. Critically, the effects of noise and heminode disruption were additive, with combined exposure leading to reduced ITD sensitivity and large temporal fluctuations in MSO responses. Severe heminode disruption, which additionally reduced firing rates at the SGN fibers and subsequent stages, produced profound localization deficits across all frequencies tested. Thus, our model results suggest that noisy environments exacerbate auditory deficits from peripheral disorders implicated in HHL and could potentially impair speech intelligibility through degradation in localization ability. This model may be useful for understanding the downstream impacts of SGN neuropathies.
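
Phase locking of the kind analyzed here is conventionally quantified by vector strength; the sketch below computes that standard metric on synthetic spike times (illustrative only, and not necessarily the authors' exact pipeline). Vector strength is 1 for perfect locking and falls toward 0 as cycle-to-cycle phase variability grows.

```python
import numpy as np

def vector_strength(spike_times, tone_freq):
    # phase of each spike within the tone cycle, as a unit vector on the circle
    phases = 2 * np.pi * tone_freq * np.asarray(spike_times)
    return np.abs(np.mean(np.exp(1j * phases)))

rng = np.random.default_rng(0)
period = 1 / 600.0                                           # 600 Hz tone, within the range tested
locked = np.arange(100) * period + rng.normal(0, 2e-4, 100)  # tight phase locking
jittered = np.arange(100) * period + rng.normal(0, 8e-4, 100)  # cycle-to-cycle phase variability
print(vector_strength(locked, 600.0), vector_strength(jittered, 600.0))
```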

5
Discrimination of spectrally sparse complex-tone triads in cochlear implant listeners

Augsten, M.-L.; Lindenbeck, M. J.; Laback, B.

2026-03-24 · neuroscience · 10.64898/2026.03.20.712905 · medRxiv
Top 0.1% · 9.2% match score

Cochlear implant (CI) users typically experience difficulties perceiving musical harmony due to a restricted spectro-temporal resolution at the electrode-nerve interface, resulting in limited pitch perception. We investigated how stimulus parameters affect discrimination of complex-tone triads (three-voice chords), aiming to identify conditions that maximize perceptual sensitivity. Six post-lingually deafened CI listeners completed a same/different task with harmonic complex tones, while spectral complexity, voice(s) containing a pitch change, and temporal synchrony (simultaneous vs. sequential triad presentation) were manipulated. CI listeners discriminated harmonically relevant one-semitone pitch changes within triads when spectral complexity was reduced to three or five components per voice, with significantly better performance for three-component compared to nine-component tones. Sensitivity was observed for pitch changes in the high voice or in both high and low voices, but not for changes in only the low voice. Single-voice sensitivity predicted simultaneous-triad sensitivity when controlling for spectral complexity and voice with pitch change. Contrary to expectations, sequential triad presentation did not improve discrimination. An analysis of processor pulse patterns suggests that difference-frequency cues encoded in the temporal envelope rather than place-of-excitation cues underlie perceptual triad sensitivity. These findings support reducing spectral complexity to enhance chord discrimination for CI users based on temporal cues.
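
A minimal sketch of the spectral-complexity manipulation described: a triad whose voices are harmonic complex tones with a controllable number of components, with a one-semitone shift applied to a chosen voice. The frequencies and parameters are illustrative, not the study's.

```python
import numpy as np

fs, dur = 48000, 0.5
t = np.arange(0, dur, 1 / fs)

def complex_tone(f0, n_components):
    # harmonic complex: components at f0, 2*f0, ..., n*f0
    return sum(np.sin(2 * np.pi * f0 * k * t) for k in range(1, n_components + 1))

def triad(f0s, n_components, shift_voice=None):
    semitone = 2 ** (1 / 12)
    f0s = [f * semitone if i == shift_voice else f for i, f in enumerate(f0s)]
    x = sum(complex_tone(f, n_components) for f in f0s)
    return x / np.max(np.abs(x))

reference = triad([262.0, 330.0, 392.0], 3)                # C major, 3 components per voice
changed = triad([262.0, 330.0, 392.0], 3, shift_voice=2)   # high voice raised one semitone
```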

6
Sound lateralization ability is affected by saccade direction but not eye movement-related eardrum oscillations (EMREOs)

Sotero Silva, N.; Bröhl, F.; Kayser, C.

2026-02-05 · neuroscience · 10.1101/2025.11.05.686724 · medRxiv
Top 0.1% · 9.0% match score

Eye-movement-related eardrum oscillations (EMREOs) are pressure changes recorded in the ear that supposedly reflect displacements of the tympanic membrane induced by saccadic eye movements. Previous studies hypothesized that the underlying mechanisms might play a role in combining visual and acoustic spatial information. Yet, whether and how the eardrum moves during an EMREO, and whether this movement affects acoustic spatial perception, remain unclear. Here we probed human acoustic lateralization performance for sounds presented at different times during a saccade (hence during the EMREO) in two tasks, one relying on free-field sounds and one presenting sounds in-ear. Since EMREO generation likely involves the middle ear muscles, whose tension can alter sound transmission, it is possible that judgements of sound location vary with the state of the EMREO at the time of sound presentation. However, when testing two specific hypotheses of how movements of the eardrum underlying the EMREO may affect spatial hearing, we found no evidence in support of this. Still, and in line with previous studies, we found that participants' lateralization responses were shaped by the spatial congruency of the saccade target direction and the sound direction. Thus, either the eardrum does not move directly as reflected by the EMREO signal, or, despite its movement, the underlying changes at the tympanic membrane have only minimal perceptual impact. Our results call for more refined studies to understand how the eardrum moves during a saccade and whether or how the EMREO impacts spatial perception.

7
Trial-By-Trial Auditory Brainstem Response Detection

Liu, G. S.; Ali, N.-E.-S.; Ó Maoiléidigh, D.

2026-02-03 · physiology · 10.64898/2026.01.31.703019 · medRxiv
Top 0.1% · 8.4% match score

The neural response of the brainstem to brief sounds, known as the auditory brainstem response (ABR), is widely employed in the laboratory and the clinic to diagnose hearing loss. In contrast to behavioral methods that assess hearing using responses to sounds on a trial-by-trial basis, current ABR approaches are limited to analyzing the average ABR over hundreds of trials. Historically, trial-by-trial ABR analysis has not been possible owing to each trial's small signal-to-noise ratio. Here we overcome this limitation and show how to classify individual ABR trials as detected or undetected. We use the distribution of single-trial ABRs to assess supra-threshold hearing and to define psychophysics-like thresholds, which we call auditory brainstem detection (ABD) thresholds. ABD thresholds decrease as more of the ABR epoch is taken into account, whereas traditional ABR thresholds do not change. Above the ABD thresholds and below 90 dB SPL, signal detection is significantly improved by utilizing more of the ABR epoch. Our method also allows us to rank the supra-threshold hearing ability of individual subjects. Despite having normal ABR thresholds, some subjects appear to have supra-threshold hearing deficits. The trial-by-trial method demonstrates that signal detection by the ensemble of auditory neurons in the brainstem is intrinsically stochastic not only at low stimulus levels, but also at levels up to 100 dB SPL. Significance Statement: Neural responses to sound can be measured by electrodes placed on a subject's head and are commonly used in the laboratory and the clinic to assess hearing. Although the auditory system must distinguish each sound stimulus from intrinsic noise, current methods for analyzing the response of the brainstem to sound only utilize the average response to hundreds of stimuli. Here we overcome this constraint by showing how to classify an individual sound stimulus as detected or undetected based on each auditory brainstem response. This approach can assess hearing at all stimulus levels, indicates that subjects with normal hearing thresholds can exhibit supra-threshold hearing loss, and potentially extends the types of hearing deficits that can be diagnosed using auditory evoked potentials.
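
The abstract does not give the classifier, so the following is only one plausible illustration of single-trial detection: score each epoch by its match to the grand-average template and compare against a null distribution built from stimulus-absent epochs.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_samp = 500, 240
template = np.sin(2 * np.pi * np.arange(n_samp) / 80) * np.hanning(n_samp)  # stand-in ABR waveform
trials = 0.2 * template + rng.normal(0, 1, (n_trials, n_samp))   # stimulus-present epochs, per-sample SNR << 1
silence = rng.normal(0, 1, (n_trials, n_samp))                   # stimulus-absent epochs

avg = trials.mean(axis=0)                        # conventional averaged ABR
scores = trials @ avg                            # matched-filter score for each single trial
null = silence @ avg                             # null distribution of the same score
threshold = np.quantile(null, 0.95)              # 5% false-detection criterion
print(f"detected on {(scores > threshold).mean():.0%} of single trials")
```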

8
Automatic parameter estimation and detection of ringed seal knocking vocalizations

Solana, A.; Young, M.; Nadeu, C.; Kunnasranta, M.; Houegnigan, L.

2026-01-29 · ecology · 10.1101/2024.05.06.592639 · medRxiv
Top 0.1% · 6.9% match score

Passive acoustic monitoring is a valuable tool for studying elusive marine mammals, but analyzing large datasets is typically labor-intensive and costly. In this study, we piloted an automatic approach for sound analysis on extensive datasets of acoustic underwater recordings from freshwater Lake Saimaa over a total of 12 months. Our focus was on "knocking" vocalizations, the most commonly found call type of the endangered Saimaa ringed seal (Pusa saimensis). The annotated datasets of knock sounds (n = 13,179) were used to train and test binary classification systems to detect this sound type. In addition, the fundamental frequencies of the vocalizations were automatically estimated by an ensemble of methods and corroborated by recent literature. The best classifier was a spectrogram-based convolutional neural network that achieved a minimum F1-score of 97.76% on unseen samples from each dataset, demonstrating its ability to detect knockings amongst noise and other events. Moreover, the estimated fundamental frequencies are comparable to the ones manually computed for the same datasets. These automated approaches can significantly reduce labor and costs associated with manual analysis, making long-term species monitoring more feasible and efficient.
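
A hedged sketch of a spectrogram-based CNN detector of the general kind described; the architecture, input size, and class labels below are illustrative assumptions, not the authors' network.

```python
import torch
import torch.nn as nn

class KnockDetector(nn.Module):
    """Binary classifier over log-spectrogram patches: knock vs. noise/other."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),          # pool to one value per feature map
        )
        self.classifier = nn.Linear(32, 2)

    def forward(self, x):                     # x: (batch, 1, freq_bins, time_frames)
        return self.classifier(self.features(x).flatten(1))

model = KnockDetector()
logits = model(torch.randn(8, 1, 128, 64))    # 8 random spectrogram patches
print(logits.shape)                           # torch.Size([8, 2])
```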

9
From sound to source: Human and model recognition of environmental sounds

Alavilli, S.; McDermott, J. H.

2026-03-14 · neuroscience · 10.64898/2026.03.12.711349 · medRxiv
Top 0.1% · 4.8% match score

Our ability to recognize sound sources in the world is critical to daily life, but is not well documented or understood in computational terms. We developed a large-scale behavioral benchmark of human environmental sound recognition, built stimulus-computable models of sound recognition, and used the benchmark to compare models to humans. The behavioral benchmark measured how sound recognition varied across source categories, audio distortions, and concurrent sound sources, all of which influenced recognition performance in humans. Artificial neural network models trained to recognize sounds in multi-source scenes reached near-human accuracy and qualitatively matched human patterns of performance in many conditions. By contrast, traditional models of the cochlea and auditory cortex that were trained to recognize sounds produced worse matches to human performance. Models trained on larger datasets exhibited stronger alignment with both human behavior and brain responses. The results suggest that many aspects of human sound recognition emerge in systems optimized for the problem of real-world recognition. The benchmark results set the stage for future explorations of auditory scene perception involving salience and attention.

10
Population decoding of sound source location by receptive field neurons in the mouse superior colliculus

Mullen, B. R.; Litke, A. M.; Feldheim, D. A.

2026-01-27 · neuroscience · 10.64898/2026.01.26.701861 · medRxiv
Top 0.1% · 4.1% match score

Identifying the location of a sound source in a complex environment and assessing its importance can be crucial for survival. The superior colliculus (SC), a midbrain structure involved in sensorimotor functions, contributes to sound localization and contains auditory responsive neurons that have spatially restricted receptive fields (RFs) organized into a topographic map along the azimuth. However, individual auditory SC neurons have large spatial RFs, are noisy, and do not respond to the same stimulus on each trial. Therefore, when an animal is presented with a "single trial" sound and must rely on a single neuron to locate the sound source direction, the location measurement may be erroneous, missing, or of poor spatial resolution. A more reliable and accurate determination of the sound source location is expected to come from a population of neurons. We therefore built a population-pattern maximum likelihood estimation (MLE) decoder that predicts the location of a stimulus from the population response. We compared three models that weight neurons by strict firing rate (FR), equally (EW), or by mutual information (MIW), and show that the MIW model works best, needing only 92 neurons to localize a stimulus with behaviorally relevant precision. Furthermore, by comparing the models fit using the responses from non-RF and RF auditory neurons, we show that only RF neurons contain the information needed to localize a sound source. These results are consistent with the hypothesis that the SC uses a population of RF neurons to determine sound source location. Author Summary: Being able to tell where a sound is coming from and how important it is can be critical for survival. The superior colliculus, a midbrain region involved in orienting behaviors, contains neurons that respond best to sounds coming from specific locations. This suggests that the combined activity of many neurons in the SC is used to determine sound location from a single sound event. To test this idea, we modeled responses from mouse SC neurons while sounds were played from different positions in space, along both the elevation and the horizon. A model that weighted the most informative neurons performed best in both directions, needing only 92 neurons to localize a stimulus with behaviorally relevant precision along the azimuth. Comparing the models fit using the responses from non-RF and RF auditory neurons, we show that only RF neurons contain the information needed to localize a sound source. Overall, our findings show that the SC can accurately locate sounds in both horizontal and vertical space using a population-based strategy, providing a simple and effective solution for rapid sound localization.
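
A minimal sketch of population MLE decoding from tuned, Poisson-noisy model neurons. The FR/EW/MIW contrast is reduced here to a per-neuron weight on the log-likelihood, and the "informativeness" weight is a crude variance-to-mean proxy rather than true mutual information; all parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
locs = np.linspace(-90, 90, 37)                  # candidate azimuths (deg)
n_neurons = 92                                   # population size highlighted in the abstract
centers = rng.uniform(-90, 90, n_neurons)        # RF centers
widths = rng.uniform(20, 60, n_neurons)          # RF widths
# mean firing rate of each neuron at each candidate location (Gaussian RFs)
tuning = 2 + 18 * np.exp(-0.5 * ((locs[None, :] - centers[:, None]) / widths[:, None]) ** 2)

true_idx = 20
counts = rng.poisson(tuning[:, true_idx])        # one "single trial" population response

def decode(counts, weights):
    # weighted Poisson log-likelihood of each candidate location
    ll = weights[:, None] * (counts[:, None] * np.log(tuning) - tuning)
    return locs[np.argmax(ll.sum(axis=0))]

ew = np.ones(n_neurons)                          # equal weighting
miw = tuning.var(axis=1) / tuning.mean(axis=1)   # crude informativeness proxy (not true MI)
print(decode(counts, ew), decode(counts, miw), "true:", locs[true_idx])
```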

11
Acoustic Salience Drives Pupillary Dynamics in an Interrupted, Reverberant Task

Figarola, V.; Liang, W.; Luthra, S.; Parker, E.; Winn, M.; Brown, C.; Shinn-Cunningham, B. G.

2026-04-02 · neuroscience · 10.64898/2026.03.31.715639 · medRxiv
Top 0.1% · 3.6% match score

Listeners face many challenges when trying to maintain attention to a target source in everyday settings; for instance, reverberation distorts acoustic cues and interruptions capture attention. However, little is known about how these challenges affect the ability to maintain selective attention. Here, we measured syllable recall accuracy and pupil dilation during a spatial selective attention task that was sometimes disrupted. Participants heard two competing, temporally interleaved syllable streams presented in pseudo-anechoic or reverberant environments. On randomly selected trials, a sudden interruption occurred mid-sequence. Compared to anechoic trials, reverberant performance was worse overall, and the interrupter disrupted performance. In uninterrupted trials, reverberation reduced peak pupil dilation both when it was consistent across all stimuli in a block and when it was randomized trial to trial, suggesting temporal smearing reduced clarity of the scene and the salience of events in the ongoing streams. Pupil dilations in response to interruptions indicated perceptual salience was strong across reverberant and anechoic conditions. Specifically, baseline pupil size before trials did not vary across room conditions, and mixing or blocking of trials (altering stimulus expectations) had no impact on pupillary responses. Together, these findings highlight that stimulus salience drives cognitive load more strongly than does task performance.

12
Hearing sounds when the eyes move: A case study implicating the tensor tympani in eye movement-related peripheral auditory activity

King, C. D.; Zhu, T.; Groh, J. M.

2026-03-25 · neuroscience · 10.64898/2026.03.24.713974 · medRxiv
Top 0.1% · 3.2% match score

Information about eye movements is necessary for linking auditory and visual information across space. Recent work has suggested that such signals are incorporated into processing at the level of the ear itself (Gruters, Murphy et al. 2018). Here we report confirmation that the eye movement signals that reach the ear can produce perceptual consequences, via a case report of an unusual participant with tensor tympani myoclonus who hears sounds when she moves her eyes. The sounds she hears could be recorded with a microphone in the ear in which she hears them (left) and occurred for large leftward eye movements to extreme orbital positions of the eyes. The sounds elicited by this participant's eye movements were reminiscent of eye movement-related eardrum oscillations (EMREOs; Gruters, Murphy et al. 2018, Bröhl and Kayser 2023, King, Lovich et al. 2023, Lovich, King et al. 2023, Lovich, King et al. 2023, Abbasi, King et al. 2025, Sotero Silva, Kayser et al. 2025, King and Groh 2026, Leon, Ramos et al. 2026, Sotero Silva, Bröhl et al. 2026), but were larger and longer lasting than classical EMREOs, helping to explain why they were audible to her. Overall, the observations from this patient help establish that (a) eye movement-related signals specifically reach the tensor tympani muscle and that (b) when there is an abnormality involving that muscle, such signals can lead to actual audible percepts. Given that the tensor tympani contributes to the regulation of sound transmission in the middle ear, these findings support the conclusion that eye movement signals reaching the ear have functional consequences for auditory perception. The findings also expand the types of medical conditions that produce gaze-evoked tinnitus, to date most commonly observed in connection with acoustic neuromas.

13
Active strains in the basal organ of Corti in gerbil

Wong, K. H.; Strimbu, C. E.; Olson, E. S.

2026-01-30 · biophysics · 10.64898/2026.01.27.702095 · medRxiv
Top 0.1% · 3.1% match score

Optical coherence tomography (OCT) has allowed in vivo recording of sound-induced vibrations of different regions within the organ of Corti complex (OCC), including the basilar membrane (BM), outer hair cell/Deiters cell (OHC/DC) region, and reticular lamina (RL). In the hook region of the gerbil cochlea, where measurements can be made with a substantially transverse optical axis, the three regions have different and characteristic motion responses: the OHC/DC region has greater motions than the other two regions at frequencies below the best frequency (sub-BF); the RL region typically has the greatest BF peak and smallest sub-BF motion. The phase of the OHC/DC-region motion increasingly lags BM motion phase as frequency increases; the RL-region motion phase leads BM, but with a relatively small value. All three regions are compressively nonlinear in the BF peak, but only the OHC/DC region shows sub-BF compressive nonlinearity. In this paper, we describe the strain that exists within the RL and OHC-body regions. These strains are large where the motion varies over short distances, and a region of large strain can be as short as a single 2.7 µm measurement pixel or extend over several pixels, with the extensive strains appearing more often at 70 than at 50 dB SPL. Beyond the region of large strain, over a distance that can exceed 20 µm, the OHC/DC region displays spatially nearly unvarying motion; this region appears to vibrate as a body. Statement of Significance: The sensory tissue of the cochlea responds actively to a sound stimulus: cell-based forces amplify and enhance the vibration of the sensory tissue. Measurements employing optical coherence tomography have identified major vibration patterns along a sensory-tissue-spanning line that includes the active outer hair cells. In this article, we describe the transitional motion between these major vibration regions and the motion strains that exist as vibration morphs from one region to the next. The findings are presented in frequency response curves, to convey the frequency tuning and its stimulus-level dependence, and in one-dimensional heat maps, to convey the extent of regional motions and strains. These findings fuel and constrain conceptual and physics-based models of cochlear amplification.
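
The strain described is, in essence, the spatial derivative of displacement across measurement pixels; here is a minimal sketch on a synthetic displacement profile (only the 2.7 µm pixel spacing comes from the abstract, everything else is invented).

```python
import numpy as np

dz = 2.7e-6                                  # OCT pixel spacing along the optical axis (m)
z = np.arange(40) * dz                       # depth axis through the OCC
# displacement amplitude with a sharp transition between two motion regions
disp = 5e-9 * np.tanh((z - 50e-6) / 5e-6)
strain = np.gradient(disp, dz)               # dimensionless strain: large where motion changes over short distances
print(f"peak strain {strain.max():.3g} at z = {z[np.argmax(strain)] * 1e6:.1f} um")
```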

14
BioDCASE: Using data challenges to make community advances in computational bioacoustics

Stowell, D.; Nolasco, I.; McEwen, B.; Vidana Vila, E.; Jean-Labadye, L.; Benhamadi, Y.; Lostanlen, V.; Dubus, G.; Hoffman, B.; Linhart, P.; Morandi, I.; Cazau, D.; White, E.; White, P.; Miller, B.; Nguyen Hong Duc, P.; Schall, E.; Parcerisas, C.; Gros-Martial, A.; Moummad, I.

2026-04-06 · animal behavior and cognition · 10.64898/2026.04.02.716062 · medRxiv
Top 0.1% · 2.7% match score

Computational bioacoustics has seen significant advances in recent decades. However, the rate of insights from automated analysis of bioacoustic audio lags behind the rate at which we collect data, owing to key capacity constraints in data annotation and bioacoustic algorithm development. Gaps in analysis methodology persist, not because they are intractable, but because of resource limitations in the bioacoustics community. To bridge these gaps, we advocate the open science method of data challenges, structured as public contests. We conducted a bioacoustics data challenge named BioDCASE within the format of an existing event (DCASE). In this work we report on the procedures needed to select and then conduct useful bioacoustics data challenges. We consider aspects of task design such as dataset curation, annotation, and evaluation metrics. We report the three tasks included in BioDCASE 2025 and the resulting progress made. Based on this, we make recommendations for open community initiatives in computational bioacoustics.

15
Peripheral phoneme encoding and discrimination in aging and hearing impairment

Wouters, M.; Gaudrain, E.; Dapper, K.; Schirmer, J.; Baskent, D.; Ruettiger, L.; Knipper, M.; Verhulst, S.

2026-01-28 · neuroscience · 10.64898/2026.01.27.702044 · medRxiv
Top 0.1% · 2.4% match score

Speech perception difficulties in noise are common among older adults and individuals with hearing impairment, even when audiometric thresholds appear normal. We examined how aging, cochlear synaptopathy (CS), and outer hair cell (OHC) damage affect speech encoding and phoneme discrimination. Envelope-following responses (EFRs) to rectangular amplitude-modulated (RAM) tones and speech-like phoneme pairs were recorded in quiet using EEG, and behavioral discrimination was assessed in quiet, ipsilateral noise, and contralateral noise. Stimuli were designed to target temporal envelope (TENV) or temporal fine structure (TFS) encoding. Results showed that RAM-EFR amplitudes decreased gradually with age, consistent with emerging CS, while magnitudes of high-frequency TENV-based EFRs in quiet were most reduced in older hearing-impaired listeners with combined CS and OHC damage. In contrast, EFRs targeting low-frequency TENV encoding in quiet remained preserved. Behaviorally, phoneme discrimination of TFS contrasts worsened with OHC loss in quiet and with age in contralateral noise, while there was no significant effect of age on the discrimination of TENV contrasts. Considering that high-frequency contrasts are discriminated via place-based spectral cues, that low-frequency contrasts rely on TFS, and that the EFR reflects primarily TENV, this framework explains why EFRs decline for high-frequency cues without perceptual loss, while EFRs remain stable for low-frequency cues even as TFS-based discrimination deteriorates. These findings highlight the need for further investigation into how neural coding deficits relate to perceptual outcomes. Combining electrophysiological and behavioral measures might provide a sensitive framework for detecting subclinical auditory deficits and enabling earlier diagnosis of age-related and hidden hearing loss.

Highlights:
- Speech-evoked EEG shows OHC loss-related decline of high-CF envelope encoding.
- Speech-evoked EEG shows low-CF envelope encoding stays intact with age.
- Fine-structure contrast discrimination worsens with OHC loss in quiet.
- Fine-structure contrast discrimination worsens with age in contralateral noise.
- High-frequency, place-based spectral cue discrimination remains robust with age.
- Peripheral coding strength is not directly reflected at the behavioral level.
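
A minimal sketch of the standard EFR readout on synthetic data: average the epochs, then read the spectral magnitude at the modulation rate against the neighboring-bin noise floor (the study's stimuli and statistics are more involved; the 110 Hz rate and all amplitudes here are assumptions).

```python
import numpy as np

fs, fm, n_epochs, n_samp = 16000, 110.0, 300, 16000   # 1 s epochs, assumed 110 Hz modulation
rng = np.random.default_rng(0)
t = np.arange(n_samp) / fs
# phase-locked response buried in EEG noise on every epoch
epochs = 0.02 * np.sin(2 * np.pi * fm * t) + rng.normal(0, 1, (n_epochs, n_samp))

avg = epochs.mean(axis=0)                              # averaging suppresses non-phase-locked noise
spec = np.abs(np.fft.rfft(avg)) / n_samp * 2           # single-sided amplitude spectrum
freqs = np.fft.rfftfreq(n_samp, 1 / fs)
bin_fm = np.argmin(np.abs(freqs - fm))
neighbors = spec[bin_fm - 12 : bin_fm + 13][np.arange(25) != 12]
print(f"EFR amplitude {spec[bin_fm]:.4f}, noise floor {neighbors.mean():.4f}")
```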

16
Potential risk for hearing from prolonged exposure to sound at conversation levels

Xue, W.; Sun, N.; Wood, E.; Xie, J.; Liu, X.; Yan, J.

2026-03-02 · neuroscience · 10.64898/2026.02.26.708062 · medRxiv
Top 0.1% · 2.2% match score

Prolonged exposure to loud and moderate noise impairs hearing; the lower the noise level, the lower the risk of hearing loss. To date, little is known about how low a noise level must be to be safe for hearing. This study investigated the risk of exposure to tones at typical conversational levels by measuring the auditory brainstem response (ABR). We show that exposing C57 mice to a continuous pure tone at 65 dB SPL for 1 hour (TE65) leads to an increase in ABR threshold that is specific to the exposure frequency. Tone exposure also increased the latencies and decreased the amplitudes of Waves I and II, but not of Waves III and V. Significantly, the changes in amplitude and latency were highly correlated in Wave I, and this correlation gradually degraded from Wave I through to Wave V. Our findings suggest that exposure to low-level sound can impair hearing and alter auditory information processing in the brain if it is persistent and presented over a sufficient period of time. Significance Statement: Our findings establish the risk of hearing impairment following exposure to a continuous tone at normal or conversational voice levels. This finding challenges current public health guidelines for hearing protection. Although further clarification is required, our studies suggest that regular ABR testing is a potential protocol for diagnosing hearing impairment in patients experiencing hidden hearing loss (HHL).

17
EEG correlates of auditory rise time processing: A systematic review

Manasevich, V.; Kostanian, D.; Rogachev, A.; Sysoeva, O.

2026-03-09 · neuroscience · 10.64898/2026.03.06.710012 · medRxiv
Top 0.1% · 2.1% match score

Rise time (RT) is considered one of the most significant acoustic characteristics of auditory speech stimuli. A substantial amount of data has been accumulated on the neurophysiological mechanisms of RT processing under different conditions and in different groups of people, but these data have not been systematised. This review focuses on studies that have investigated electroencephalographic (EEG) markers of RT sensitivity. The literature search was conducted according to the PRISMA statement in the PubMed, Web of Science, and APA PsycInfo databases. The resulting review comprised 37 studies that considered diverse aspects of RT processing. The review describes the main stimulation parameters affecting electrophysiological markers of RT processing, reflected in different components of event-related potentials (ERPs), brainstem responses, and cortical rhythmic activity. The main finding of this review is that rise time prolongation leads to a decrease in the amplitude of the main ERP components and an increase in their latencies. However, the sensitivity of the EEG markers varied, with the earliest components tracking subtle differences (a few tens of microseconds) and the later components coding larger ones (up to 500 ms). Nevertheless, the observed effects may vary and depend on aspects of the experimental paradigm, the age of participants, and speech-related problems. Future research may benefit from addressing understudied clinical groups and ERP components such as P1 and N2, which dominate in children.

18
Improving Automated Diagnosis of Middle and Inner Ear Pathologies by Estimating Middle Ear Input Impedance from Wideband Tympanometry

Kamau, A. F.; Merchant, G. R.; Nakajima, H. H.; Neely, S. T.

2026-03-31 · otolaryngology · 10.64898/2026.03.26.26349034 · medRxiv
Top 0.1% · 1.9% match score

Conductive hearing loss (CHL) with a normal otoscopic exam can be difficult to diagnose because routine clinical measures such as audiometric air-bone gaps (ABGs) can identify a conductive component but often cannot distinguish among specific underlying mechanical pathologies (e.g., stapes fixation versus superior canal dehiscence, which may produce similar audiograms). Wideband tympanometry (WBT) is a fast, noninvasive test that can provide additional mechanical information across a broad range of frequencies (200 Hz to 8 kHz). However, WBT metrics are influenced by variations in ear canal geometry and probe placement and can be challenging to interpret clinically. In this study, we extend prior WBT absorbance-based classification work by estimating the middle ear input impedance at the tympanic membrane (ZME), a WBT-derived metric intended to reduce ear canal effects. To estimate ZME, we fit an analog circuit model of the ear canal, middle ear, and inner ear to raw WBT data collected at tympanometric peak pressure (TPP). Data from 27 normal ears, 32 ears with superior canal dehiscence, and 38 ears with stapes fixation were analyzed. A multinomial logistic regression classifier was trained using principal component analysis (retaining 90% variance) and stratified 5-fold cross-validation with regularization. We compared feature sets based on ABGs alone, ABGs combined with absorbance, and ABGs combined with the magnitude of ZME. The combination of ABGs and the magnitude of ZME produced the best performance, achieving an overall accuracy of 85.6% compared to 80.4% for ABGs alone and 78.4% for ABGs combined with absorbance. These results suggest that incorporating model-derived middle ear impedance features with standard audiometric measures (ABGs) can improve automated pathology classification for stapes fixation and superior canal dehiscence.
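
A hedged sketch of the reported classification pipeline (PCA retaining 90% of variance, regularized multinomial logistic regression, stratified 5-fold cross-validation) on synthetic stand-in features; this mirrors the described steps but is not the authors' code, and the feature dimensionality is invented.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
n = 97                                          # 27 normal + 32 dehiscence + 38 stapes fixation ears
X = rng.normal(size=(n, 40))                    # stand-in for ABG + |ZME| features per ear
y = np.repeat([0, 1, 2], [27, 32, 38])          # normal / superior canal dehiscence / stapes fixation
X[y == 1] += 0.8                                # inject synthetic class separation
X[y == 2] -= 0.8

clf = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.90),                     # keep components explaining 90% of variance
    LogisticRegression(C=1.0, max_iter=1000),   # L2-regularized multinomial classifier
)
scores = cross_val_score(clf, X, y, cv=StratifiedKFold(5, shuffle=True, random_state=0))
print(f"mean CV accuracy {scores.mean():.3f}")
```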

19
Ontogeny of vocalizations in Adélie penguin (Pygoscelis adeliae) chicks

Adams, M. L.; Fradet, D. T.; Cimino, M. E.; White, E. R.; Kloepper, L. N.

2026-01-25 · zoology · 10.64898/2026.01.23.701401 · medRxiv
Top 0.1% · 1.8% match score

Passive acoustic monitoring (PAM) is an efficient method to monitor dense aggregations of vocal animals but requires understanding the acoustic ecology of the species under examination. Avian vocal development is largely understood from songbirds, limiting its application to non-passerine taxa with different social and environmental pressures. As an example, colonial seabirds such as the Adélie penguin (Pygoscelis adeliae) inhabit acoustically crowded environments and rely on vocal cues, in addition to spatial information, for parent-offspring recognition. While adult penguin vocal communication is well studied, chick vocal development remains poorly characterized. Using the deep learning-based system DeepSqueak, we aimed to characterize the vocal development of wild P. adeliae chicks in the West Antarctic Peninsula. We found that acoustic features of chick calls changed systematically with age, with calls becoming longer and more frequency modulated over time. Characterizing chick vocal development from hatch to fledge provides important information to study phenological communication patterns in vocal-dependent seabirds and supports the application of PAM to assess climate-driven impacts on indicator species.

20
Vocal Signatures of Stress Relief: Effects of Appeasing Harness and Synthetic Pheromone on Puppy Whine Acoustics in Separation Context (Canis familiaris)

Philippe, R.; Le-Bourdiec-Shaffi, A.; Kaltsatos, V.; Reby, D.; Massenet, M.

2026-04-06 · animal behavior and cognition · 10.64898/2026.04.02.715714 · medRxiv
Top 0.1% · 1.7% match score

In mammals, loud, high-pitched, and harsh-sounding calls typically accompany heightened emotional arousal, particularly during distress such as separation. However, whether subtle arousal reductions can be detected through acoustic analysis within a single negative context remains unclear. We investigated whether source-related acoustic parameters of puppy whines reflect arousal modulations induced by calming interventions during maternal separation. Thirty-five eight-week-old Beagle puppies were recorded under four conditions combining a synthetic appeasing pheromone and a pressure harness. Vocal behavior, activity, whine duration, and intensity did not significantly differ across treatments, suggesting the interventions did not suppress separation-related vocal responses. Nevertheless, the calming products selectively altered acoustic parameters known to index arousal in dog vocalizations. Puppies receiving combined treatments produced whines with lower fundamental frequency (fo) and reduced fo variability, while pheromone exposure increased call tonality, reflected in reduced jitter and shimmer and elevated harmonics-to-noise ratios. Spectral entropy remained unchanged, possibly because the proportion of whines containing nonlinear phenomena did not vary across conditions. Reductions in fo, fo variability, and acoustic roughness are consistent with established correlates of lower arousal in mammals, suggesting that source-related vocal parameters sensitively capture subtle arousal shifts even when overt vocal behavior remains stable, supporting their use as bioacoustic indicators for evaluating welfare interventions.
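
Two of the source-related measures used here, fo and the harmonics-to-noise ratio, can be illustrated with a short-time autocorrelation. The sketch below is a crude stand-in for the Praat-style analyses typically used in such work; the signal and all parameters are invented.

```python
import numpy as np

def fo_and_hnr(frame, fs, fmin=150.0, fmax=2000.0):
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[frame.size - 1 :]
    ac /= ac[0]                                   # normalized autocorrelation
    lo, hi = int(fs / fmax), int(fs / fmin)       # candidate period range in samples
    lag = lo + np.argmax(ac[lo:hi])
    r = ac[lag]                                   # periodicity strength at the best lag
    hnr_db = 10 * np.log10(r / (1 - r))           # Boersma-style harmonics-to-noise ratio
    return fs / lag, hnr_db

fs = 44100
t = np.arange(0, 0.04, 1 / fs)                    # one 40 ms analysis frame
whine = np.sin(2 * np.pi * 600 * t) + 0.2 * np.random.default_rng(0).normal(size=t.size)
print(fo_and_hnr(whine, fs))                      # ~600 Hz, positive HNR
```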